
Elasticsearch File Storage

This post analyzes how the files of an Elasticsearch index are stored on disk.
The main goal is to find out at what granularity FST files are created.

First, use Kibana to pick a shard of some index. Here we use the logstash-2023.05.30 index as the example.

Check how its shards are distributed:

GET /_cat/shards/logstash-2023.05.30?v


index shard prirep state docs store ip node
logstash-2023.05.30 3 p STARTED 1520736 408.1mb 10.138.40.73 10.138.40.73-node1
logstash-2023.05.30 5 p STARTED 1520888 409.9mb 10.138.40.74 10.138.40.74-node1
logstash-2023.05.30 6 p STARTED 1518331 408.2mb 10.138.40.221 10.138.40.221-node1
logstash-2023.05.30 4 p STARTED 1518186 409.3mb 10.138.204.194 10.138.204.194-node1
logstash-2023.05.30 1 p STARTED 1519231 408.8mb 10.138.40.220 10.138.40.220-node1
logstash-2023.05.30 2 p STARTED 1519970 409.9mb 10.138.204.195 10.138.204.195-node1
logstash-2023.05.30 0 p STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1

The analysis below uses shard 0, which lives on 10.138.204.193.
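If you would rather pick the node from a script than by eye, the text output of `_cat/shards` can be parsed as whitespace-separated columns. The following is only a minimal sketch: `parse_cat_shards` and `SAMPLE` are names made up for illustration, and it assumes every row has all eight columns (an unassigned shard would have fewer).

```python
def parse_cat_shards(text):
    """Parse the whitespace-separated output of GET /_cat/shards/<index>?v."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Assumption: each row has exactly as many columns as the header.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


# Two rows copied from the output above.
SAMPLE = """\
index shard prirep state docs store ip node
logstash-2023.05.30 3 p STARTED 1520736 408.1mb 10.138.40.73 10.138.40.73-node1
logstash-2023.05.30 0 p STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1
"""

rows = parse_cat_shards(SAMPLE)
shard0 = next(r for r in rows if r["shard"] == "0" and r["prirep"] == "p")
print(shard0["ip"], shard0["node"])
```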

To locate the storage directory, we first need the ID of the index:

GET /logstash-2023.05.30/_settings

{
"logstash-2023.05.30" : {
"settings" : {
"index" : {
"codec" : "best_compression",
"routing" : {
"allocation" : {
"include" : {
"_tier_preference" : "data_content"
}
}
},
"refresh_interval" : "60s",
"number_of_shards" : "7",
"provided_name" : "logstash-2023.05.30",
"creation_date" : "1685376005206",
"number_of_replicas" : "0",
"uuid" : "FYWtFGTIS2CLB8yJhFXG9g",// this is the index ID
"version" : {
"created" : "7130499"
}
}
}
}
}
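The UUID can also be pulled out of the settings response in code. This is a small sketch, not part of the original walkthrough: `index_uuid` is a hypothetical helper, and `response` is a trimmed copy of the JSON above.

```python
def index_uuid(settings_response, index_name):
    """Return the UUID that names the index's directory on disk."""
    return settings_response[index_name]["settings"]["index"]["uuid"]


# Trimmed copy of the GET /logstash-2023.05.30/_settings response.
response = {
    "logstash-2023.05.30": {
        "settings": {
            "index": {
                "number_of_shards": "7",
                "uuid": "FYWtFGTIS2CLB8yJhFXG9g",
            }
        }
    }
}

print(index_uuid(response, "logstash-2023.05.30"))
```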

Log in to the machine and find the directory that stores the index files:

/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g
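The path has a regular shape, so it can be assembled from the node's data path, the index UUID and the shard number. The sketch below assumes the layout seen on this 7.13 cluster (`nodes/0/indices/<uuid>/<shard>/index`); `shard_index_dir` is a hypothetical helper, and the data path `/data3/10.138.204.193-node1` is specific to this machine.

```python
from pathlib import PurePosixPath


def shard_index_dir(node_data_path, index_uuid, shard):
    """Build the Lucene index directory of one shard.

    Assumes the on-disk layout observed in this article:
    <data path>/nodes/0/indices/<index uuid>/<shard>/index
    """
    return (
        PurePosixPath(node_data_path)
        / "nodes" / "0" / "indices"
        / index_uuid / str(shard) / "index"
    )


path = shard_index_dir("/data3/10.138.204.193-node1", "FYWtFGTIS2CLB8yJhFXG9g", 0)
print(path)
```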

Expand the files under this directory:

root@prd-paas-es-01:/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g# tree -C -s
.
├── [ 4096] 0
│ ├── [ 20480] index
│ │ ├── [ 158] _17f.fdm
│ │ ├── [ 25578562] _17f.fdt
│ │ ├── [ 1939] _17f.fdx
│ │ ├── [ 4636] _17f.fnm
│ │ ├── [ 7981735] _17f.kdd
│ │ ├── [ 20898] _17f.kdi
│ │ ├── [ 716] _17f.kdm
│ │ ├── [ 7945983] _17f_Lucene80_0.dvd
│ │ ├── [ 3916] _17f_Lucene80_0.dvm
│ │ ├── [ 6230127] _17f_Lucene84_0.doc
│ │ ├── [ 3875001] _17f_Lucene84_0.pos
│ │ ├── [ 7448815] _17f_Lucene84_0.tim
│ │ ├── [ 108786] _17f_Lucene84_0.tip
│ │ ├── [ 1637] _17f_Lucene84_0.tmd
│ │ ├── [ 593] _17f.si
│ │ ├── [ 158] _3uv.fdm
│ │ ├── [ 33652243] _3uv.fdt
│ │ ├── [ 2555] _3uv.fdx
│ │ ├── [ 4636] _3uv.fnm
│ │ ├── [ 10520395] _3uv.kdd
│ │ ├── [ 27689] _3uv.kdi
│ │ ├── [ 716] _3uv.kdm
│ │ ├── [ 10573208] _3uv_Lucene80_0.dvd
│ │ ├── [ 3916] _3uv_Lucene80_0.dvm
│ │ ├── [ 8298061] _3uv_Lucene84_0.doc
│ │ ├── [ 5154427] _3uv_Lucene84_0.pos
│ │ ├── [ 9716222] _3uv_Lucene84_0.tim
│ │ ├── [ 142063] _3uv_Lucene84_0.tip
│ │ ├── [ 1620] _3uv_Lucene84_0.tmd
│ │ ├── [ 593] _3uv.si
│ │ ├── [ 158] _5bg.fdm
│ │ ├── [ 16433011] _5bg.fdt
│ │ ├── [ 1259] _5bg.fdx
│ │ ├── [ 4636] _5bg.fnm
│ │ ├── [ 5158094] _5bg.kdd
│ │ ├── [ 13396] _5bg.kdi
│ │ ├── [ 716] _5bg.kdm
│ │ ├── [ 5140762] _5bg_Lucene80_0.dvd
│ │ ├── [ 3916] _5bg_Lucene80_0.dvm
│ │ ├── [ 4005897] _5bg_Lucene84_0.doc
│ │ ├── [ 2583880] _5bg_Lucene84_0.pos
│ │ ├── [ 4873082] _5bg_Lucene84_0.tim
│ │ ├── [ 70979] _5bg_Lucene84_0.tip
│ │ ├── [ 1593] _5bg_Lucene84_0.tmd
│ │ ├── [ 593] _5bg.si
│ │ ├── [ 158] _60h.fdm
│ │ ├── [ 24664753] _60h.fdt
│ │ ├── [ 1886] _60h.fdx
│ │ ├── [ 4636] _60h.fnm
│ │ ├── [ 7640438] _60h.kdd
│ │ ├── [ 19996] _60h.kdi
│ │ ├── [ 716] _60h.kdm
│ │ ├── [ 7754954] _60h_Lucene80_0.dvd
│ │ ├── [ 3916] _60h_Lucene80_0.dvm
│ │ ├── [ 6147241] _60h_Lucene84_0.doc
│ │ ├── [ 3998559] _60h_Lucene84_0.pos
│ │ ├── [ 7254035] _60h_Lucene84_0.tim
│ │ ├── [ 105673] _60h_Lucene84_0.tip
│ │ ├── [ 1719] _60h_Lucene84_0.tmd
│ │ ├── [ 593] _60h.si
│ │ ├── [ 200] _7jq.fdm
│ │ ├── [ 63208093] _7jq.fdt
│ │ ├── [ 4692] _7jq.fdx
│ │ ├── [ 4636] _7jq.fnm
│ │ ├── [ 19306117] _7jq.kdd
│ │ ├── [ 51562] _7jq.kdi
│ │ ├── [ 716] _7jq.kdm
│ │ ├── [ 20228561] _7jq_Lucene80_0.dvd
│ │ ├── [ 3916] _7jq_Lucene80_0.dvm
│ │ ├── [ 15606568] _7jq_Lucene84_0.doc
│ │ ├── [ 9581341] _7jq_Lucene84_0.pos
│ │ ├── [ 17383473] _7jq_Lucene84_0.tim
│ │ ├── [ 272615] _7jq_Lucene84_0.tip
│ │ ├── [ 1592] _7jq_Lucene84_0.tmd
│ │ ├── [ 593] _7jq.si
│ │ ├── [ 437] _82w.cfe
│ │ ├── [ 4489379] _82w.cfs
│ │ ├── [ 408] _82w.si
│ │ ├── [ 437] _87w.cfe
│ │ ├── [ 4932636] _87w.cfs
│ │ ├── [ 408] _87w.si
│ │ ├── [ 437] _8ao.cfe
│ │ ├── [ 13905317] _8ao.cfs
│ │ ├── [ 408] _8ao.si
│ │ ├── [ 437] _8ls.cfe
│ │ ├── [ 20181047] _8ls.cfs
│ │ ├── [ 408] _8ls.si
│ │ ├── [ 437] _8nq.cfe
│ │ ├── [ 1234712] _8nq.cfs
│ │ ├── [ 408] _8nq.si
│ │ ├── [ 437] _8oa.cfe
│ │ ├── [ 872798] _8oa.cfs
│ │ ├── [ 408] _8oa.si
│ │ ├── [ 437] _8pp.cfe
│ │ ├── [ 1593677] _8pp.cfs
│ │ ├── [ 408] _8pp.si
│ │ ├── [ 437] _8r5.cfe
│ │ ├── [ 914008] _8r5.cfs
│ │ ├── [ 408] _8r5.si
│ │ ├── [ 437] _8rf.cfe
│ │ ├── [ 940473] _8rf.cfs
│ │ ├── [ 408] _8rf.si
│ │ ├── [ 437] _8rz.cfe
│ │ ├── [ 1315312] _8rz.cfs
│ │ ├── [ 408] _8rz.si
│ │ ├── [ 437] _8s9.cfe
│ │ ├── [ 1121692] _8s9.cfs
│ │ ├── [ 408] _8s9.si
│ │ ├── [ 437] _8sk.cfe
│ │ ├── [ 243476] _8sk.cfs
│ │ ├── [ 408] _8sk.si
│ │ ├── [ 1678] segments_6
│ │ └── [ 0] write.lock
│ ├── [ 4096] _state
│ │ ├── [ 186] retention-leases-2865.st
│ │ └── [ 125] state-0.st
│ └── [ 4096] translog
│ ├── [ 55] translog-29.tlog
│ └── [ 88] translog.ckp
└── [ 4096] _state
└── [ 1230] state-2.st

5 directories, 118 files

Now that we have the file listing, let's look at the segment information:

GET /logstash-2023.05.30/_segments

// For readability, only the segments of shard 0 are shown here
{
"_shards": {
"total": 7,
"successful": 7,
"failed": 0
},
"indices": {
"logstash-2023.05.30": {
"shards": {
"0": [
{
"routing": {
"state": "STARTED",
"primary": true,
"node": "4hEWcF8hRFWTEkQxlKQmqg"
},
"num_committed_segments": 17,
"num_search_segments": 17,
"segments": {
"_17f": {
"generation": 1563,
"num_docs": 210331,
"deleted_docs": 0,
"size_in_bytes": 59203502,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": false,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_3uv": {
"generation": 4999,
"num_docs": 278411,
"deleted_docs": 0,
"size_in_bytes": 78098502,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": false,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_5bg": {
"generation": 6892,
"num_docs": 132645,
"deleted_docs": 0,
"size_in_bytes": 38291972,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": false,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_60h": {
"generation": 7793,
"num_docs": 199809,
"deleted_docs": 0,
"size_in_bytes": 57599273,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": false,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_7jq": {
"generation": 9782,
"num_docs": 520420,
"deleted_docs": 0,
"size_in_bytes": 145654675,
"memory_in_bytes": 5204,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": false,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_82w": {
"generation": 10472,
"num_docs": 15416,
"deleted_docs": 0,
"size_in_bytes": 4490224,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_87w": {
"generation": 10652,
"num_docs": 16837,
"deleted_docs": 0,
"size_in_bytes": 4933481,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8ao": {
"generation": 10752,
"num_docs": 48855,
"deleted_docs": 0,
"size_in_bytes": 13906162,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8ls": {
"generation": 11152,
"num_docs": 70903,
"deleted_docs": 0,
"size_in_bytes": 20181892,
"memory_in_bytes": 5140,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8nq": {
"generation": 11222,
"num_docs": 3954,
"deleted_docs": 0,
"size_in_bytes": 1235557,
"memory_in_bytes": 6924,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8oa": {
"generation": 11242,
"num_docs": 2785,
"deleted_docs": 0,
"size_in_bytes": 873643,
"memory_in_bytes": 6820,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8pp": {
"generation": 11293,
"num_docs": 5194,
"deleted_docs": 0,
"size_in_bytes": 1594522,
"memory_in_bytes": 7060,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8r5": {
"generation": 11345,
"num_docs": 2936,
"deleted_docs": 0,
"size_in_bytes": 914853,
"memory_in_bytes": 6748,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8rf": {
"generation": 11355,
"num_docs": 2920,
"deleted_docs": 0,
"size_in_bytes": 941318,
"memory_in_bytes": 6836,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8rz": {
"generation": 11375,
"num_docs": 4304,
"deleted_docs": 0,
"size_in_bytes": 1316157,
"memory_in_bytes": 6820,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8s9": {
"generation": 11385,
"num_docs": 3647,
"deleted_docs": 0,
"size_in_bytes": 1122537,
"memory_in_bytes": 6892,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
},
"_8sk": {
"generation": 11396,
"num_docs": 657,
"deleted_docs": 0,
"size_in_bytes": 244321,
"memory_in_bytes": 7620,
"committed": true,
"search": true,
"version": "8.8.2",
"compound": true,
"attributes": {
"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"
}
}
}
}
]
}
}
}
}

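Comparing the segment list with the files in the shard directory shows a one-to-one correspondence: each of the 17 segment names (_17f, _3uv, ...) is the filename prefix of that segment's files. Segments reported with "compound": true appear on disk as just .cfs, .cfe and .si files, while the others are spread across separate per-format files.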
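The comparison can be sketched in code by grouping filenames on their segment prefix and checking the result against the names returned by `_segments`. This is an illustrative sketch only: `group_by_segment`, the regular expression and the shortened `files` list are my own. It also checks an observation from the data above, namely that each segment name matches its `generation` written in base 36 (for example `_17f` and 1563).

```python
import re
from collections import defaultdict

# "_17f.fdt" and "_17f_Lucene84_0.tim" both belong to segment "_17f".
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[^.]+)?\.(\w+)$")


def group_by_segment(filenames):
    """Map segment name -> set of file extensions found for it."""
    groups = defaultdict(set)
    for name in filenames:
        match = SEGMENT_FILE.match(name)
        if match:  # skips segments_6 and write.lock
            groups[match.group(1)].add(match.group(2))
    return dict(groups)


# A few entries copied from the tree output and the _segments response.
files = [
    "_17f.fdt", "_17f_Lucene84_0.tim", "_17f_Lucene84_0.tip", "_17f.si",
    "_82w.cfe", "_82w.cfs", "_82w.si", "segments_6", "write.lock",
]
api_segments = {"_17f": 1563, "_82w": 10472}  # name -> generation

groups = group_by_segment(files)
print(sorted(groups) == sorted(api_segments))
for name, generation in api_segments.items():
    print(name, int(name[1:], 36) == generation)
```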

Check the ES version and the corresponding Lucene version:

GET /

{
"name" : "10.138.204.193-node1",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "XWDyVuo6TgK4yUp2XWD3lw",
"version" : {
"number" : "7.13.4",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "c5f60e894ca0c61cdbae4f5a686d9f08bcefc942",
"build_date" : "2021-07-14T18:33:36.673943207Z",
"build_snapshot" : false,
"lucene_version" : "8.8.2",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}

So what do the files with the various extensions in the shard directory mean? Let's take a look.

Screenshot source:
https://lucene.apache.org/core/8_8_2/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description

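The table shows that the FST-related file extensions are tip and tim: .tip is the terms index, which holds the FST, and .tim is the term dictionary that the FST points into. Every segment has its own .tip and .tim files (for compound segments they are bundled inside the .cfs file), so FSTs are created at segment granularity.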
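As a quick reference for the extensions that appear in the listing above, here is a small lookup sketch. The descriptions are my own paraphrase of what each file holds, not text quoted from the Lucene page, and `EXTENSIONS` and `describe` are hypothetical names.

```python
# Extensions seen in this shard directory; descriptions are paraphrased.
EXTENSIONS = {
    "si": "segment info",
    "cfs": "compound file (data of the bundled files)",
    "cfe": "compound file (entry table)",
    "fnm": "field infos",
    "fdt": "stored field data",
    "fdx": "stored field index",
    "fdm": "stored field metadata",
    "tim": "term dictionary",
    "tip": "terms index (FST into the term dictionary)",
    "tmd": "terms metadata",
    "doc": "postings: doc ids and frequencies",
    "pos": "postings: positions",
    "dvd": "doc values data",
    "dvm": "doc values metadata",
    "kdd": "points (BKD tree) data",
    "kdi": "points (BKD tree) index",
    "kdm": "points (BKD tree) metadata",
}


def describe(filename):
    """Return a short description based on the file extension."""
    ext = filename.rsplit(".", 1)[-1] if "." in filename else ""
    return EXTENSIONS.get(ext, "not a per-segment codec file")


for name in ["_17f_Lucene84_0.tip", "_17f_Lucene84_0.tim", "_82w.cfs", "segments_6"]:
    print(name, "->", describe(name))
```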
